Abstract

The purpose of this study was to examine how mid-term student ratings feedback and 
consultation on instructional practices affected teaching practices, ratings of teaching 
effectiveness, learning and motivation. Thirty-seven teaching assistants (TAs) for undergraduate 
computer science and chemistry courses were randomly assigned to either a feedback/ 
consultation group or a control group. TAs in the treatment group received feedback and 
consultation on mid-term student ratings from the Instructional Activities Feedback Form 
(IAFF), which assesses the use of the instructional activities described in a model of instruction. 
Final student rating results revealed significant differences (p < .05) in favor of the 
feedback/consultation group on teaching practices, ratings of teaching effectiveness, and student 
motivation. Significant positive relationships (p < .01) between use of the instructional activities 
and final exam scores were also evident. These findings suggest that both United States and 
international TAs can use student ratings feedback to improve teaching practice. Suggestions for 
future research and practice using the IAFF and consultation process are also discussed. 

From Theory to Practice: Using an Instructional Theory to Provide Feedback and Consultation to 
Improve College Teaching, Learning and Motivation 

In recent years at large universities in the United States, the percentage of lower-level 
undergraduate courses taught by teaching assistants has risen to over 50% (Allen & Rueter, 
1990; Curry et al., 1999). Unfortunately, however, only half of the teaching assistants (TAs) 
teaching such courses receive formal training to prepare them for their teaching responsibilities 
(Gray & Buerkel-Rothfuss, 1991), and the majority of this training is in the form of brief 
workshops before each term (Ronkowski, 1989). Thus, even the training that TAs receive is 
quite limited. 

After they begin teaching their classes, TAs rarely receive guidance from experienced 
teachers (Moore, 1996). McMillen’s (1986) survey of 600 TAs across the country revealed that 
50% of TAs were never observed teaching by a faculty member. It is unfortunate that our 
institutions of higher learning often place the graduate teaching assistant in front of 
undergraduate students with little or no training or feedback about how to teach those students. 

Student ratings feedback may offer an efficient and effective means of providing TAs 
with valid feedback on their instructional practices. Two extensive meta-analyses (Cohen, 1980; 
L’Hommedieu, Menges & Brinko, 1990) have shown significant improvements in teaching 
effectiveness and student affect for those teachers who receive mid-term student ratings 
feedback, especially when that feedback is combined with consultation. However, researchers 
examining the effects of such feedback have expressed concern that the ratings instruments used 
to provide feedback are generally surveys of broad teaching constructs or behaviors that are 
found on typical end-of-semester student ratings forms (McKeachie, 1997; Murray, 1983, 1997). 
They indicate that such instruments do not provide instructors with information regarding the 
specific instructional activities they need to improve. To overcome this problem, several 
researchers (Abrami, d’Apollonia, & Cohen, 1990; Cranton & Smith, 1990; d’Apollonia & Abrami, 
1997; Marsh, 1984) have suggested linking ratings instruments to theories of teaching or 
learning. 

A relevant theory to link to a rating instrument intended to improve instruction is Gagne’s 
(1985) theory of instruction. Based on the information-processing learning theory, Gagne’s 
theory describes fundamental principles for designing and delivering effective instruction while 
incorporating nine events of instruction. 

Building on Gagne’s theory, Reiser and Dick (1996) developed an instructional model for 
teachers, which incorporates most of Gagne’s nine events of instruction. The six major 
instructional activities in this model are motivating students, informing students of objectives, 
helping students recall prerequisite knowledge, presenting information and examples, providing 
practice and feedback, and summarizing the lesson. This model of instruction was used as the 
basis for the Instructional Activities Feedback Form (Hampton & Reiser, 2000), the student 
rating instrument employed in this study. 

The Instructional Activities Feedback Form (IAFF) allows students to provide feedback 
to their teachers about specific instructional practices related to each of the six instructional 
activities in the Reiser and Dick model. Thus, the information students provide when they 
respond to the items on the IAFF can be tied directly to particular types of instructional activities 
in the classroom. The complete IAFF can be found in the Appendix. 

As mentioned earlier, previous research has demonstrated that mid-term student ratings 
feedback is particularly effective when it is combined with consultation (Cohen, 1980; 
L’Hommedieu et al., 1990). Therefore, we developed a consultation process to aid TAs in 
interpreting and implementing the IAFF student ratings feedback. 

Several researchers have developed and assessed consultation procedures based on 
established models of consultation (Brinko, 1988, 1990; Orban, 1981; Rutt, 1979; Wilson, 1986). 
These models fall on a continuum from being very directive and consultant-driven to being less 
directive and client-centered. Brinko (1990, 1991) studied 10 instructional consultants from 8 
universities and found that consultants used more of a prescriptive approach for novice teachers, 
such as TAs, because they needed more guidance due to a lack of teaching experience. 

The consultation approach employed in this study was a prescriptive approach similar to 
one described by Brinko (1991). During the first phase, establish the relationship, the consultant 
and each TA met to establish rapport, discuss expectations, and arrange a schedule for data 
collection and review. In the second phase, data collection, which took place at mid-semester, the 
Instructional Activities Feedback Form (IAFF) was used to gather student reactions to the 
instructional practices employed by the TA. During this phase the consultant also observed the 
TA teaching a class and took notes on the teaching practices the TA employed. During the third 
phase, the conference, the TA and consultant met to review the feedback the students and 
consultant provided via the IAFF and classroom observation and to discuss how that feedback 
might be used to help improve the TA’s instructional practices. The fourth phase, follow-up, 
consisted of a follow-up observation and meeting to verify that the TA understood how to 
implement the activities in the classroom. 

The purpose of this study was to examine how the midterm feedback students provided to 
TAs via the IAFF, combined with the consultation process described above, affected TA use of 
various instructional activities, student ratings of teaching effectiveness of the TAs, and the 
learning and motivation of the students. TAs in the treatment group received feedback and 
consultation on the midterm IAFF student ratings while TAs in the control group did not receive 
any feedback or consultation. Inasmuch as the feedback and consultation TAs received was 
directly related to the six instructional activities examined in this study, it was expected that the 
TAs who received this feedback and consultation subsequently would make greater use of those 
activities than the TAs in the control group. Moreover, as previous research has indicated 
(Cohen, 1980; L’Hommedieu et al., 1990), it was expected that those TAs who received 
feedback and consultation would subsequently receive higher ratings of teaching effectiveness 
than their counterparts who did not receive such feedback. 

It was also hypothesized that students of TAs in the treatment group would learn more 
and have higher levels of student motivation toward the course than students of TAs in the 
control group. This was based on previous research, which revealed that students of TAs who 
received feedback and consultation had higher final exam scores (Overall & Marsh, 1979) and 
more positive attitudes toward the course content (Erickson & Sheehan, 1976; McKeachie et al., 
1980; Overall & Marsh, 1979) than students whose TAs did not receive feedback and 
consultation. 

Finally, it was hypothesized that TA utilization of the instructional activities examined in 
this study would be positively related to student learning. Feldman’s (1989) thorough reanalysis 
of Cohen’s (1981, 1987) meta-analyses revealed rather strong correlations between certain 
teaching practices (or dimensions) and student learning within multi-section courses. Many of 
the 28 practices Feldman analyzed can be categorized into one of the six types of instructional 
activities we were examining. 

Method 

Participants 

The primary participants in this study were 37 TAs who were teaching one of two lower- 
division courses at a large southeastern university. Twenty TAs taught a computer literacy course 
that was a required class for every undergraduate student attending the university. The other 17 
TAs taught a basic chemistry lab course that was a required course for all undergraduate 
chemistry majors. All TAs from both courses volunteered to participate in the study. 

Seventeen TAs were from the United States and 20 were from other countries. The 
computer literacy department had seven international TAs in each treatment condition, and the 
chemistry department had three international TAs in each treatment condition. Most of the TAs 
had between 5 and 8 hours of teacher training and 1.6 semesters of teaching experience; however, 
14 TAs had no prior teaching experience. There were no significant differences between the 
treatment groups in TA teaching experience, F(1, 35) = .115, p = .737, or teacher training, 
F = .008, p = .930. 

The TAs in this study taught a total of 93 sections of classes. The students in these 
sections were the other participants in the study. The number of students per section ranged from 
15 to 24, and the total number of students involved in the study was approximately 1600. All of the 
students who participated in the study were in their freshman or sophomore year. 

We attempted to assess the initial equivalency of the students in the two treatments by 
conducting two analyses. Inasmuch as previous research in this area (Cohen, 1981; Overall & 
Marsh, 1979) had used math SAT scores to assess initial differences between groups, we 
examined this variable and found that there was no significant difference in the math SAT scores 
of the students in the two groups, F(1, 35) = 1.738, p = .196. In addition, we found that there was 
no significant difference between the students in the two groups with regard to their level of 
interest in the course content at the beginning of the semester, F(1, 35) = .114, p = .738. 

Independent Variable 

The independent variable examined in this study was midterm student ratings feedback 
and consultation related to the six instructional activities described by Reiser and Dick (1996). 
The two levels of this variable were no feedback on the midterm student ratings (control group) 
and written feedback with personal consultation on the midterm student ratings (treatment 
group). 

The written feedback summarized how students responded to each of the 38 items on the 
Instructional Activities Feedback Form (IAFF). Twenty items on this form were designed to 
assess student perceptions of the frequency with which their TA employed each of the six 
instructional activities examined in this study. As suggested by reviewers of student ratings literature 
(L’Hommedieu et al., 1990; Marsh, 1984; McKeachie, 1997), two to five items on the IAFF 
were used to assess the TA’s utilization of each instructional activity. For example, one question 
assessing the ‘motivating students’ activity asked students to indicate how often (from 1 for almost 
never to 5 for almost always) the TA began the lesson with an interesting or exciting fact, 
demonstration, or question. 

In addition to measuring the frequency with which TAs used various instructional 
activities, the IAFF also included four items designed to measure student motivation toward the 
course and one item that asked students to rate, on a five point scale, the overall teaching 
effectiveness of their TA. More information about the IAFF, which was also administered at the 
end of the semester as a measure of some of the dependent variables examined in this study, is 
contained in the next section of this paper. 

The midterm written feedback received by each TA listed, for each of the first 36 items 
on the IAFF, the mean score the TA received on that item, as well as the mean score on that 
item for all of the TAs in the study. Written feedback regarding the last two items on the survey, 
which were open-ended questions regarding what students liked and disliked about the TA’s 
instructional practices, categorized all of the open-ended responses according to the six 
instructional activities examined in this study. 

The personal consultation aspect of the independent variable involved observing and 
consulting with each TA in the treatment group twice during the semester. The first observation 
session took place between the second and fifth week of the semester. During this observation 
session, the researcher listed specific examples of a TA’s use of each of the six instructional 
activities examined in this study. If the TA did not use one or more of the instructional activities, 
the researcher listed strategies the TA could have used to incorporate those activities into the 
classroom session. 

The first consultation session, or ‘conference,’ took place during the seventh week of the 
semester, shortly after the first administration of the IAFF. During this session, which lasted for 
approximately 75 minutes, the researcher helped the TA review his or her ratings on the IAFF, 
focusing on how the TA could improve those instructional activities for which the TA had 
received relatively poor ratings. The notes the researcher had taken during the first observation 
session served as the basis for most of the suggestions the researcher offered. 

The second observation session, which was much like the first, took place during the 
eighth or ninth week of the semester. Immediately after this session, the second consultation, or 
follow-up session, took place. This session lasted approximately 15 minutes, during which time 
the researcher provided further suggestions as to how the TA might improve his or her 
instructional activities. 

Dependent Variables 

The dependent variables for this study included (a) the extent to which the TAs employed 
each of the six instructional activities examined in this study, (b) student ratings of the overall 
teaching effectiveness of the TAs, (c) student learning, and (d) student motivation in the course. 

The extent to which the TAs employed each of the six instructional activities was 
measured by the items on the IAFF that asked students to rate TA use of those activities. There 
were between two and five items related to each of the six activities, and the Cronbach’s alpha 
reliability coefficients for these six sets of items ranged from r = .83 to .89. The construct validity 
for this portion of the IAFF was also assessed. An exploratory factor analysis (Crocker & Algina, 
1986) yielded a five-factor structure paralleling five of the six instructional activities the IAFF 
was designed to assess. The ‘objectives’ and ‘present information’ items loaded on the same 
factor. 

As is commonly done in the student ratings literature (L’Hommedieu et al., 1990), students 
rated the overall teaching effectiveness of their TA by responding to one item on the IAFF, 
which asked: “Overall, how effective would you rate your teacher’s instruction?” The 5-point 
Likert scale ranged from ‘poor’ to ‘excellent.’ All student ratings for each TA in each treatment 
group were combined to produce a mean teaching effectiveness score for all the TAs in that 
group. 

Student learning was measured by averaging the final exam scores in each TA’s section. 
In both the chemistry and computer literacy courses, the final exam measured student acquisition 
of the course material that was covered during the second half of the term, after the intervention 
took place. This material was outlined in the prescribed syllabus for each course. Each TA in the 
chemistry lab course taught students the same common lab procedures and introductory 
‘hands-on’ lab exercises. Each TA in the computer literacy course taught the same basic computer 
applications such as PowerPoint, Microsoft Word, and Excel. The reliability of the chemistry lab 
final exam was r = .41, and the reliability of the computer literacy final exam was r = .59. 

Finally, student motivation was measured by four items on the IAFF, each of which measured 
one of the four components (attention, relevance, confidence and satisfaction) of Keller’s ARCS 
model (1987a, 1987b). The reliability of this section of the IAFF was r = .80. 

Procedures 

The TAs from each department, 20 in computer literacy and 17 in chemistry, were 
randomly assigned to either the feedback and consultation (treatment) group or the control group. 
TAs did not know which group they were assigned to until after the first observation for the 
feedback group and the midterm survey administration for the control group. 

Students in the treatment and control groups responded to the items on the IAFF for the first 
time in the 6th week of the term. A student in each class section administered and collected the 
confidential surveys at the beginning of class by reading a script provided by the researcher. As 
noted earlier, the researcher observed each TA in the treatment group twice during the semester, 
once during the 2nd to 5th week of the 15-week term and again in the 8th or 9th week. The 
researcher also consulted with each TA twice, with the initial session taking place approximately 
one week after the students completed the IAFF for the first time and the second consultation 
session taking place approximately two weeks after the initial one. TAs in the control group were 
not observed or consulted with because these procedures were part of the experimental treatment. 

At the end of the semester, students in the treatment and control groups took the final 
exam for their course and responded to the items on the IAFF for a second time. As was 
previously the case, this confidential survey was administered and collected by a student in each 
class section. 

After the semester was over, the researcher met with all the TAs in the study to provide 
them with the results from the second administration of the IAFF. At the conclusion of this 
session, the TAs completed an attitude survey designed to provide the researcher with feedback 
on the utility of the IAFF and the consultation process. 

Research Design and Data Analysis 

A 2 X 2 analysis of covariance (ANCOVA), a technique common in the student ratings 
literature (Cohen, 1980; L’Hommedieu et al., 1990), was used to assess the effects of the 
independent variable on the dependent variables and also test for any interactions between the 
chemistry and computer literacy departments. To assess the effects of the intervention on student 
learning, however, a One-Way ANCOVA was used to examine effects within each academic 
course separately. The TA was the unit of analysis except where noted. 

We also examined the relationships between the use of the instructional activities and the 
dependent variables, such as learning outcomes, ratings of teaching effectiveness, and student 
motivation, using Pearson Product Moment correlations. All hypotheses and analysis discussed 
were apriori, except where noted. 
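
As an illustration of the correlational analysis described above, the following minimal sketch computes a Pearson product-moment correlation with the class section as the unit of observation. The function name `pearson_r` and the section-level values are hypothetical and are not data from this study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical section-level data (NOT from the study): mean IAFF frequency
# rating for one activity and mean final exam score for five sections.
activity_scores = [3.1, 3.4, 3.8, 4.0, 4.3]
exam_means = [71.0, 74.0, 73.0, 78.0, 80.0]

r = pearson_r(activity_scores, exam_means)
print(f"r = {r:.2f}")
```

Each pair of values stands for one section, which mirrors the section-level comparison of mean activity ratings with mean exam scores.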

Results 

Teaching Practices 

We hypothesized that feedback and consultation on the IAFF would positively affect 
TAs’ subsequent use of the teaching practices in the classroom as evidenced by the final student 
ratings. The adjusted mean final assessment scores, based on the initial assessment scores 
(covariate), for each group are presented in Table 1. The ANCOVA helped reduce error variance 
and statistically controlled for the minor (non-significant) initial differences between groups on 
their ratings of each teaching practice before the intervention, as suggested by reviewers of the 
student ratings literature (L’Hommedieu et al., 1990). The total score on the IAFF represents the 
mean score of items 1 through 20. As evident in Table 1, students rated TAs in the treatment 
group significantly higher, p < .01, than TAs in the control group on all six activities or practices. 
Higher ratings mean that students noticed the TAs using a particular instructional activity more 
frequently. 
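
To make the idea of a covariate-adjusted mean concrete, the sketch below applies the standard ANCOVA adjustment: each group's final mean is shifted by the pooled within-group regression slope times the distance of that group's initial mean from the grand initial mean. The function name `adjusted_means` and all of the scores are hypothetical. This is an illustration of the general technique, not a reproduction of the analysis or the values in Table 1.

```python
from statistics import mean


def adjusted_means(groups):
    """Covariate-adjusted group means.

    groups maps a group name to a list of (initial_score, final_score) pairs,
    one pair per TA. The adjustment uses the pooled within-group slope.
    """
    all_initial = [x for pairs in groups.values() for x, _ in pairs]
    grand_initial = mean(all_initial)

    # Pool cross-products and squared deviations within each group.
    sxy = 0.0
    sxx = 0.0
    group_means = {}
    for name, pairs in groups.items():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        group_means[name] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx  # pooled within-group regression slope

    # Shift each final mean to what it would be at the grand initial mean.
    return {
        name: my - slope * (mx - grand_initial)
        for name, (mx, my) in group_means.items()
    }


# Hypothetical (initial, final) IAFF ratings per TA; NOT data from the study.
scores = {
    "control": [(3.0, 3.1), (3.4, 3.5), (3.8, 3.9)],
    "feedback": [(2.8, 3.4), (3.2, 3.8), (3.6, 4.2)],
}
result = adjusted_means(scores)
print({name: round(value, 2) for name, value in result.items()})
```

In this made-up example the feedback group starts slightly lower on the initial assessment, so its adjusted final mean is raised and the control group's is lowered, which is the sense in which the covariate controls for initial differences.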

A 2 X 2 ANCOVA examined the main effects between the feedback and control group 
and possible interactions between the chemistry and computer science departments. The 
covariate was significant, p < .001, for each analysis. The Bonferroni procedure (Kirk, 1995) was 
also used to protect against inflation of the family-wise error rate in the examination of the six 
separate activities and the total score. The interaction between the treatment group and 
department was not significant for any comparisons. 
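
The Bonferroni procedure mentioned above divides the family-wise alpha by the number of comparisons. The sketch below shows that arithmetic for seven comparisons (the six activities plus the total score). The family-wise alpha of .05 and the function name `bonferroni_alpha` are assumptions made for illustration, since the per-comparison threshold is not stated in the text.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha that keeps the family-wise error rate at family_alpha."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return family_alpha / n_comparisons


# Seven comparisons: six instructional activities plus the total score.
# The family-wise alpha of .05 is an assumption, not a value from the paper.
per_test_alpha = bonferroni_alpha(0.05, 7)
print(round(per_test_alpha, 4))  # prints 0.0071
```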

Results of the ANCOVAs are also presented in Table 1 and illustrate that the feedback 
group received significantly higher ratings, p < .01, on all six activities and the total score, which 
supported the hypothesis on teaching practices. The eta-squared results in Table 1 also illustrate the 
proportion of variance attributed to the treatment, which was fairly large (ranging from .36 to 
.48) for each of the six instructional activities. The effect sizes (Cohen, 1988) for each activity 
ranged from d = .53 for the practice and feedback activity to d = .90 for the motivating students 
activity. The effect size for the total score, combining all six activities, was d = .77. 

Teaching Effectiveness 

The hypothesis that TAs in the treatment group would be rated higher in teaching 
effectiveness than TAs in the control group was also supported. For the teaching 
effectiveness item on the IAFF, the adjusted mean scores and standard deviations for each group 
on the final assessment were 3.26 (SD = .64) for the control group (N = 19) and 3.58 (SD = .35) 
for the feedback group (N = 18). The initial assessment score for each TA on the same item was 
used as the covariate to produce the adjusted mean score. By department, the adjusted mean 
scores for both groups are shown in Figure 1. 

The ANCOVA revealed that TAs in the feedback group were rated significantly higher in 
teaching effectiveness than TAs in the control group, F(1, 32) = 15.410, p < .01. The interaction 
between the treatment group and department was also significant, F(1, 32) = 5.235, p = .029. 
Figure 1 illustrates the interaction between the treatment group and the department for teaching 
effectiveness. The positive effect of the intervention on teaching effectiveness ratings 
was significantly greater for TAs in the chemistry department than for TAs in the computer literacy 
department. Analysis of the simple main effects within each department revealed significant 
differences in favor of the treatment group over the control group in the chemistry department, 
F(1, 32) = 17.52, p < .01. The difference between groups within the computer literacy department, 
however, was not statistically significant, F(1, 32) = 1.44, p = .239. 

Student Learning 

The hypothesis that the intervention would positively affect student learning was not 
supported. Using the TA as the unit of analysis, as suggested by reviewers (Marsh & Roche, 
1997; Feldman, 1989), all student scores were consolidated to provide an average exam score for 
each TA’s students. The average student math SAT score, which was significantly related to 
final exam performance (F = 8.0, p = .011), for each section and the standard deviation of that score 
were used as covariates to control for initial differences between student sections. The adjusted 
mean scores and standard deviations for each group on the final exam in each department are 
listed in Table 2. On average, students of TAs in the feedback group in both departments 
scored slightly higher on the final exams, by approximately 2%, than students of TAs in the 
control group; however, this difference was not statistically significant for either department. 

As mentioned earlier, we also hypothesized that there would be a positive relationship 
between TA utilization of the instructional activities and student learning, as measured by 
performance on the final exams. Although there was not a significant difference between groups 
on the final exam, there were significant positive relationships between ratings of the frequency 
of use of five of the six instructional activities and final exam scores in the computer literacy 
course (see Table 3), which supported the hypothesis. For each student section (N = 61), the TA’s 
mean frequency score for each activity and the total score on the IAFF were compared to the 
mean final exam score for all the students in that section by using Pearson product-moment 
correlations. The correlations between each activity and the final exam scores ranged from 
r = .28 for practice and feedback to r = .36 for present information (see Table 3). The 
correlations were also positive, though much lower, for the chemistry lab course (N = 32), but 
none were statistically significant. In examining these correlations, it is important to keep in 
mind that the two posttests employed as measures of learning had fairly low levels of reliability 
(r = .59 for the computer literacy exam and r = .41 for the chemistry exam). In such 
circumstances, through a process known as attenuation, observed correlations are predictably 
lower than they would have been had the measurement instruments been more reliable. Although 
one must work with the data at hand, it is reasonable to anticipate 
that the correlation between posttest performance and the extent of use of the Reiser and Dick 
instructional activities would be higher were measurement error more effectively controlled. 
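
The size of the attenuation effect can be illustrated with Spearman's classical correction, which divides the observed correlation by the square root of the product of the two reliabilities. The sketch below is a hypothetical illustration and not an analysis reported in this study. It uses the reported r = .36 and the reported exam reliability of .59, and it assumes, for illustration only, that the relevant IAFF subscale reliability is .83, the lower bound of the reported range. The function name `disattenuate` is likewise an assumption.

```python
from math import sqrt


def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction for attenuation: r_xy / sqrt(r_xx * r_yy)."""
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must be in the interval (0, 1]")
    return r_observed / sqrt(reliability_x * reliability_y)


r_observed = 0.36        # reported: 'present information' vs. final exam score
exam_reliability = 0.59  # reported: computer literacy final exam
iaff_reliability = 0.83  # ASSUMPTION: lower bound of the reported .83 to .89 range

estimate = disattenuate(r_observed, iaff_reliability, exam_reliability)
print(f"estimated correlation without measurement error: {estimate:.2f}")
```

Under these assumptions the corrected estimate is noticeably larger than the observed correlation, which is the direction of change the paragraph above anticipates. Such corrected values are theoretical estimates and should be interpreted cautiously.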

Incidentally, the ratings of teaching effectiveness were also related to final exam scores, 
r = .40 (N = 61), p < .01. Thus, TAs who scored higher on student ratings of teaching effectiveness 
and more frequently used Reiser and Dick’s activities also had students who scored higher on the 
final exam. Although this finding does not imply a causal relationship, there is a significant 
positive relationship between use of the instructional practices and learning outcomes in the 
computer literacy course, which also provides support to the Reiser and Dick model. 

Student Motivation 

The hypothesis that the intervention would also positively affect student motivation for 
those students of TAs in the treatment group was not supported. The four items on the IAFF final 
assessment (# 33-36) that measured elements of student motivation were combined to produce a 
mean ‘student motivation’ score. Scores ranged from one to five, with higher scores signifying 
higher student motivation. The initial assessment score for each TA on the ‘initial student 
interest in the course’ item was used as the covariate to produce the adjusted mean score. The 
adjusted mean scores and standard deviations for each group on the final assessment were 3.07 
(SD = .43) for the control group (N = 19) and 3.23 (SD = .37) for the feedback group (N = 18). 
Although the motivation score for students in the feedback group was higher than the motivation 
score for students in the control group, the difference was not statistically significant, 
F(1, 32) = 3.474, p = .072. 

A post hoc examination of each of the four motivation questions, however, revealed that 
students of TAs in the treatment group (N = 18) reported a higher ‘interest in the course’, 
M = 3.07 (SD = .30), compared to students of TAs in the control group, M = 2.86 (SD = .28). This 
difference in student interest after the intervention was statistically significant, 
F(1, 32) = 4.356, p = .045; however, this finding is tenuous because of a possible inflation of the 
family-wise error rate due to four additional comparisons. As mentioned earlier, there was no 
significant difference between groups in initial interest in the course subject at the beginning of 
the term. 

As with the student learning correlations discussed earlier, the student motivation 
dimension rating was also positively related to the teachers’ use of all six instructional activities. 
Correlations ranged from r = .70 to r = .89. Thus, although the intervention did not have a 
significant effect on the overall student motivation dimension rating, it did have a significant 
effect on student ‘interest’ in the course content. As with the learning outcomes, the teacher’s use 
of Reiser and Dick’s activities was also positively related to student motivation. 

Incidentally, the international TAs (ITAs) seemed to benefit even more from the 
intervention than the United States TAs (USTAs). The random assignment of TAs to treatment 
groups resulted in relatively equal numbers of ITAs and USTAs in each treatment group in each 
department. There were 10 ITAs and 9 USTAs in the control group and 10 ITAs and 8 USTAs in the 
feedback group. Results of the initial assessment revealed that USTAs scored higher than ITAs on 
the IAFF ratings for each instructional activity, and the differences were significant in ratings of 
‘motivating students,’ F(1, 35) = 6.26, p < .05, and overall teaching effectiveness, 
F(1, 35) = 5.25, p < .05. After the final assessment, however, there were no significant differences 
between ITAs and USTAs in their frequency of use of activities or their effectiveness ratings. 

Discussion 

The purpose of this study was to examine how mid-term student ratings feedback and 
consultation on instructional practices with college teaching assistants (TAs) affected teaching, 
learning and motivation. In essence, could TAs translate student feedback and consultation, 
based on Reiser and Dick’s model, into specified instructional practices, and would those 
practices enhance their teaching effectiveness, student learning and student motivation? Results 
of this study illustrate that such feedback and consultation can have a significant impact on 
ratings of teaching practices and ratings of teaching effectiveness, and that such practices are 
significantly related to learning outcomes and student motivation. 

Teaching Practices 

Results from the IAFF supported the hypothesis on teaching practices and revealed that 
TAs in the feedback group used each of the six activities significantly more frequently than TAs 
in the control group. The practical importance of differences caused by the intervention is 
evident when examining the effect sizes and the proportion of variance accounted for by the 
treatment (Eta²), which ranged from 29% to 48% (see Table 1). The effect sizes for the 
individual activities, d = .53 to .90, and all the activities combined, d = .77, represent medium 
to large effects according to Cohen’s (1988) criteria. The combined effect size also exceeded the 
average overall effect size of d = .64 found in Cohen’s (1980) meta-analysis of studies with 
‘augmented’ feedback. 
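
As an illustration of where a standardized effect size of this magnitude comes from, the following minimal sketch computes Cohen’s d from the total mean scores and standard deviations reported in Table 1. The study does not state which formula was used, so the pooled-standard-deviation formula and the function name `cohens_d` are assumptions made here for illustration. The result falls close to, but does not exactly reproduce, the reported combined value of d = .77.

```python
import math

def cohens_d(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Cohen's d using a pooled standard deviation (assumed formula)."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return (mean_t - mean_c) / math.sqrt(pooled_var)

# "Total mean score" row of Table 1: feedback group vs. control group.
d_total = cohens_d(3.74, 0.40, 18, 3.36, 0.59, 19)
print(f"d = {d_total:.2f}")
```

Because the means in Table 1 are adjusted for initial assessment scores, a value computed this way should be read only as an approximation of the reported effect size.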

One possible explanation for the significant findings in the current study is 
the consultation that accompanied the ratings feedback. On average, studies that provided 
personal consultation with the student ratings feedback were nearly four times as effective in 
changing subsequent rating results as studies that did not provide consultation with the ratings 
(L’Hommedieu et al., 1990). Cohen (1980) adds: “Augmented feedback, or more specifically 
expert consultation, seems to be the key element for making student-rating data useable for 
improvement purposes” (p. 338). Weimer (1997) reviewed the impact of the five most common 
instructional interventions on improving teaching. Of these interventions, which included 
personal consultation, teaching workshops/seminars, research grants, peers, and teaching 
resource materials, Weimer concluded: “The best empirical evidence for any intervention is that 
summoned on behalf of consultation over student ratings” (p. 424). Interestingly, during 
interviews the TAs highlighted the value of the consultation in helping them translate the 
feedback into their subsequent teaching practices. One TA noted that the consultation was “Very, 
very, very helpful. He [the consultant] helped me realize my weaknesses and found ways to 
correct them.” Another TA simply stated: “I believe this [consultation] is the most helpful part of 
the study - extremely helpful.” 

In addition to the consultation procedure, another possible explanation for the significant 
increase in the frequency of TAs using the instructional activities is the ‘specificity’ 
of the feedback provided to the TAs. Researchers have noted that ‘specificity’ and ‘concreteness’ 
in ratings instruments and consultation practices enhance teachers’ ability to change their 
behaviors (Marsh & Roche, 1993; McKeachie, 1997; Murray, 1983). Cashin (1999) adds: 
“Specific, concrete, behaviorally oriented information is most useful in trying to improve your 
teaching . . . items included on the form should be descriptive of specific and concrete teaching 
behavior” (p. 34). TAs who participated in the feedback group in this study attributed their ability 
to change their teaching practices to the specificity of both the IAFF items and the consultation 
related to their IAFF and their teaching. When asked to describe the difference between the 
rating and feedback system used in this study and the university feedback system, a TA 
stated that the IAFF and consultation were “more effective [than the university system] - there 
was more specific feedback. I liked the organization into categories . . . [which] helped me 
recognize what areas I needed improvement in.” The results in this study were possibly due to 
the combined aspects of providing specific feedback related to the IAFF ratings and providing 
consultation on specific classroom strategies to improve those ratings. 

Teaching Effectiveness 

The hypothesis that the intervention would positively affect ratings of teaching 
effectiveness was supported. TAs in the treatment group received significantly higher ratings of 
teaching effectiveness than TAs in the control group (see Figure 1). McKeachie et al.’s (1980) 
study of 40 psychology teachers and Overall and Marsh’s (1979) study of 30 computer-programming 
sections also found significant differences on teaching effectiveness in favor of the 
groups that received printed feedback and personal consultation related to student-ratings results. 
Taken together, these studies provide strong support for the use of this type of intervention. 

McKeachie (1997) provides another possible explanation for the differences between 
treatment groups by stating, “Good teaching involves building bridges between what is in your 
[the teacher’s] head and what is in the students’ heads” (p. 1223). One of the TAs echoed 
McKeachie’s comment by saying that the feedback and consultation helped her “to know what I 
am - like a bridge between the students and me.” Other TA comments were: “It gave me a better 
idea of what the students’ thoughts and concerns were so I could teach them better;” “It has 
helped me see how I teach from the students’ perspective. I was able to change the way I 
structure the class to please some of the students.” Thus, if the TA is more aware of what the 
students need to enhance their learning, she is more likely to employ those practices or activities 
to build those ‘bridges’ between herself and the student to facilitate the learning process and 
ultimately become more effective at teaching. 

Why was the effect of the intervention greater among TAs in the chemistry department 
than among TAs in the computer literacy department? The difference in total hours of 
instruction between departments may provide an explanation. The computer literacy classes met 
once each week for 75 minutes while the chemistry classes met once each week for 3 hours. 
Because of a new testing schedule and procedure in the computer literacy department, students 
rated the TAs based on four class meetings, or five hours of instruction, after the initial 
assessment. The chemistry students, however, rated their TAs on the final assessment based on 
seven class meetings, or 21 hours of instruction, after the initial assessment. Thus, chemistry 
students had over four times as much exposure to their TAs on which to base their evaluation 
of teaching effectiveness. 
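
The exposure figures above follow directly from the meeting counts and class lengths given in the text. The short sketch below simply reproduces that arithmetic; the function and variable names are illustrative and do not come from the study.

```python
MINUTES_PER_HOUR = 60

def instruction_hours(meetings, minutes_per_meeting):
    """Total hours of instruction across a number of class meetings."""
    return meetings * minutes_per_meeting / MINUTES_PER_HOUR

# Class meetings held between the initial and final assessments.
computer_literacy_hours = instruction_hours(4, 75)      # four 75-minute meetings
chemistry_hours = instruction_hours(7, 3 * 60)          # seven 3-hour meetings

# How many times more instruction the chemistry students observed.
exposure_ratio = chemistry_hours / computer_literacy_hours

print(computer_literacy_hours, chemistry_hours, round(exposure_ratio, 1))
```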

Both courses focused on hands-on applications of fundamental skills and procedures 
through guided practice. Because the chemistry course students actually had more face-to-face 
interaction with the TA than the computer literacy course students, they may have been more 
prepared to notice and rate changes in their TA’s teaching effectiveness. A limitation to this 
study is that both courses focused on lower-level learning outcomes and were highly structured, 
a setting conducive to more direct instruction. The intervention may have different effects in 
courses that target higher-level learning and are less structured and more student-centered. 

Student Learning 

The hypothesis examining the effect of the intervention on student learning was not 
supported. Even though students of TAs in the treatment group scored higher than students of 
TAs in the control group (see Table 2), the difference was not significant. Likewise, 
McKeachie et al.’s (1980) study of 40 psychology teachers found significant differences 
between groups for only one of the four courses examined. Overall and Marsh (1979), however, did 
find significant differences between groups on final exam scores in their study of 30 
computer-programming sections. Their study, though, used the student as the unit of analysis (N = 751), 
whereas the TA was the unit of analysis in the current study. Using the TA as the unit of 
analysis, as suggested by reviewers of the literature in this area (L’Hommedieu et al., 1990; 
Cranton & Smith, 1990), and examining learning outcomes by department limited the total 
sample size for each analysis to 20 and 17, which also reduced power. 

Another possible reason that there was no difference in learning across the groups could 
be the nature of the final exams in both departments. Both departments in this study used new 
tests or testing procedures that may have detracted from the quality of the assessment. During 
the study, the computer literacy department used a new centralized and automated online testing 
procedure for the first time, and indeed the reliability of the 29-item final exam was only r = .59. 
The exam’s low reliability, then, may have affected the results related to student learning. 

The chemistry department also employed a new final exam. Previously, there had not 
been a final exam in this introductory laboratory course. However, for the purpose of this study, 
the department developed an assessment instrument consisting of ten multiple-choice items, with 
two questions for each of the five major lessons covered in the last half of the course. As evidenced 
by the overall mean score on the exam (65.56%), students had difficulty with the test, which 
may not have effectively measured their learning from the course. In fact, the reliabilities of the 
eight versions of the exam ranged from r = .22 to .54, with an average reliability of r = .41. The 
quality of both the computer literacy and chemistry exams, then, may have actually limited the 
ability to detect differences in learning between the two treatment groups. 

As Abrami et al. (1990) highlighted in their review, the type of criterion measures used 
must also be considered when assessing learning outcomes. Final exams are not the only means 
to assess student learning, and multiple-choice tests, which were used in the chemistry 
department, often assess lower-order thinking skills. Although multi-section final exams have 
been the predominant means to measure student learning outcomes in student ratings research, 
reviewers (Marsh & Dunkin, 1997; McKeachie, 1997) suggest other means should be examined. 
In future studies in this area, student learning might be measured using more comprehensive 
assessment measures throughout the term, including performance-oriented measures. 

Although the treatment condition did not have an effect on student learning, there was, as 
predicted, a positive relationship between student learning and the frequency of use of five of the 
six instructional activities in the computer literacy department. Interestingly, the findings in the 
current study for computer literacy closely parallel the findings of Cohen’s (1981, 1987) and 
Feldman’s (1989) meta-analyses, which also reveal positive correlations between the use of 
particular instructional activities and student learning. Table 3 compares the findings of the 
current study to these two meta-analyses, which also share almost identical correlations (r = .40, 
.40, and .39) between ratings of teaching effectiveness and student learning. The 
significant positive relationships between the instructional activities and final exam scores 
were found in the computer literacy course but not in the chemistry course, possibly because the 
less reliable chemistry exams did not adequately assess student learning. 

Student Motivation 

The hypothesis on the overall student motivation rating dimension was not supported, but 
the post hoc analysis revealed that the ‘student interest’ item was significantly higher for the 
feedback group. An explanation for the difference in student interest between the two groups, 
which was not different before the intervention, may be related to the extent to which TAs in the 
treatment group focused their attention on motivating their students. As shown in Table 1, 
students indicated that TAs in the treatment group used this activity to a much greater extent than 
TAs in the control group. One of the primary purposes of this instructional activity is to increase 
student interest in what is being taught. However, because this study focused on all six 
instructional activities, we cannot conclude which specific activity actually affected the students’ 
interest. In this study, then, the intervention did not significantly affect student learning or 
overall student motivation; however, the TAs’ use of Reiser and Dick’s instructional activities 
was significantly related to student learning and motivation. 

Effect on International TAs 

As noted earlier, an incidental finding revealed that the current intervention was 
particularly helpful for the International TAs (ITAs). Before the intervention, the ITAs in this 
study scored significantly lower than TAs from the United States (USTAs) on the student ratings 
of ‘motivating students’ and ‘teaching effectiveness,’ a result similar to previous research 
findings (Twale, Shannon & Moore, 1997). After the intervention, however, there was no 
significant difference between the student ratings of ITAs and USTAs. A possible explanation 
for the positive effects the treatment had on the ratings of the ITAs is the nature of 
the consultation. Many ITAs experience much more formal teacher-student relationships in their 
academic cultures than the relationships evident in college classrooms in the United States 
(Twale et al., 1997). During the consultation, several ITAs mentioned that, in comparison to 
students in their own countries, United States students were much more ‘informal’ and ‘open’ in 
class. One ITA noted that the consultation “helped to show how real world examples could be 
tied into the teachings and reinforced ways to better tie the material into the student’s 
life/understanding.” The consultant’s discussion on how to incorporate the ‘motivating students’ 
activity, then, may have helped them ‘bridge’ this cultural gap while making class more relevant 
and enjoyable. 

Future Research 

The feedback and consultation procedure had a short-term effect on the teaching practice 
of TAs, but will it influence TA instructional practices over a longer period of time? The full 
impact of the intervention may not be evident until the following semesters when the teachers 
gain more confidence and competence in using the instructional activities. Giving teachers more 
time to practice the activities may then produce a greater impact on student learning and 
motivation outcomes. Or, conversely, the treatment may lose its impact over time. Thus, 
longitudinal research examining the resilience of the intervention should be undertaken. 

In addition, other research procedures should be explored that economize the resources 
required to implement a similar intervention that yields similar results. For each teacher, the 
researcher needed time to observe a class (1.25 hrs), provide feedback and 
consultation to the TA (1.25 hrs), observe another class (1.25 hrs), and provide follow-up 
consultation (.25 hrs). Including the time it took to type the student comments and process the 
survey results for each TA, the total intervention time per TA averaged just under five hours. 
Finding sufficient university consultants to train a large number of TAs under procedures similar 
to those used in this study would be quite resource intensive. 

An alternative approach may eliminate nearly 50 percent of the consultant time per TA 
and be more realistic for university consultants or even department supervisors. Instead of the 
2.5-hour initial observation and one-on-one consultation at the beginning of the semester, all TAs 
could be offered a 75-minute workshop on incorporating the instructional activities in the 
classroom. This would leave two hours of dedicated one-on-one training per teacher, a much 
more realistic figure for university or department resources. 
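
The “nearly 50 percent” estimate can be checked against the component times listed earlier. The sketch below is only a rough tally: the dictionary labels are illustrative, and treating “just under five hours” as a 5.0-hour upper bound is an assumption made here, not a figure reported in the study.

```python
# Hours per TA for each listed component of the intervention (from the text).
components = {
    "first class observation": 1.25,
    "feedback and consultation": 1.25,
    "second class observation": 1.25,
    "follow-up consultation": 0.25,
}
listed_hours = sum(components.values())

# Assumption: "just under five hours" is treated as a 5.0-hour upper bound,
# which also covers typing comments and processing survey results.
total_hours_upper_bound = 5.0

# The proposed workshop would replace the initial observation and consultation.
replaced_hours = (components["first class observation"]
                  + components["feedback and consultation"])
share_replaced = replaced_hours / total_hours_upper_bound

print(listed_hours, replaced_hours, share_replaced)
```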

Implications 

The current study has important practical and theoretical implications for student ratings 
research and instructor consultation. The effects of the ratings feedback and consultation on 
ratings of teaching practices and ratings of teaching effectiveness lend support to prior 
research that has highlighted the importance of personal consultation. As mentioned earlier, new 
teachers also prefer such consultation to all other teacher-training initiatives combined. Yet, 
Erickson’s (1986) survey of 630 four-year institutions found that only 38 percent of those 
institutions offered personal consultation in conjunction with student survey ratings. If colleges 
and universities do not have the resources to provide such personal consultation, our academic 
colleges and/or departments should consider providing new teachers with the personal 
consultation or mentoring that they desire and most likely need to enhance their effectiveness 
and confidence in the classroom. Such feedback and personal consultation can be even more 
beneficial to the many International TAs who are continuing to fill our TA and professor ranks, 
especially in the hard sciences. In addition to language proficiency training, Shannon, Twale & 
Moore (1998) emphasize that ITAs could benefit from more specific training on ‘how’ to teach. 

The current study also points to the need for continued research into more theoretically 
based student rating instruments. Reiser and Dick’s instructional activities, which evolved from 
Gagne’s theory of instruction and the information-processing model of learning, served as the 
theoretical ‘anchor’ for developing the IAFF instrument and consultation process. Positive 
correlations between the activities and student learning warrant further examination of the effects 
of focusing on these activities on student ratings instruments and during TA training and 
consultation. 

This study was the first step in an attempt to examine the effectiveness of a student 
ratings feedback form and consultation process specifically linked to a theory of learning and 
instruction. The intervention had a positive effect on ratings of teaching practices and teaching 
effectiveness, but it did not produce higher learning outcomes. Fortunately, using Reiser and 
Dick’s activities in the classroom was positively related to student learning and motivation in this 
study. Continued research on the theoretically based IAFF student rating instrument and 
consultation procedure may help uncover the most important finding of all, a positive effect on 
learning. 

Appendix 

Instructional Activities Feedback Form 



Instructions: Your feedback may assist your teacher’s preparation and delivery of future 
classes. Please answer all questions on the space provided using a #2 pencil. Teachers 
will receive a typed summary of results and NOT see original handwritten comments. 
For the following questions, please estimate how often your teacher uses the following 
activities or behaviors during a typical class period. The ratings are: 

A=Almost Never B=Infrequently C=Occasionally D=Often E=Almost Always 

1. Begins lesson with an interesting or exciting fact, demonstration, or question related to the lesson topic A B C D E 

2. Uses interesting activities to maintain your interest and participation throughout class A B C D E 

3. Is obviously enthusiastic and excited about lesson topics while teaching in class A B C D E 

4. Offers praise, rewards, or recognition to students in class for excellent answers, ideas or performance A B C D E 

5. Informs the class of the specific goal(s) or objective(s) for each lesson, e.g. By the end of this lesson, 
you should know or be able to do the following. . . A B C D E 

6. Clearly indicates what you will be expected to learn during the class period A B C D E 

7. Reviews information that was taught in a previous class(es) that will help you understand the new 
information or perform the new task. e.g. Remember, the three parts of a paragraph are A B C D E 

8. Asks students questions about previous lesson topics that are related to the current lesson A B C D E 

9. Relates new information to something you already know from previous classes or experiences A B C D E 

10. Presents the information clearly to enhance your understanding of the lesson topic A B C D E 

11. Provides the right amount of information to enable you to learn the key facts and/or skills being taught A B C D E 

12. Provides adequate responses to student questions or comments to clarify the key facts and/or skills A B C D E 

13. Uses a variety of relevant examples to help you learn the key facts and/or skills being taught A B C D E 

14. Provides guidance (tips) to students to help them remember the key facts and/or skills, 
e.g. Remember the colors of the rainbow with the name ‘ROY-G-BIV’ A B C D E 

15. Provides adequate opportunities for you to practice in class the facts or skills being taught A B C D E 

16. Provides relevant practice problems that enhance your understanding of the facts and/or skills A B C D E 

17. Provides specific feedback about the practice problems in class to help you understand the correct answer A B C D E 

18. Provides specific feedback about the practice problems in class to correct common student errors A B C D E 

19. Summarizes the important aspects of the lesson before dismissing the class A B C D E 

20. Reviews the objectives or goals of the lesson at the end of each lesson A B C D E 

(Continued on Back) 

Instructional Activities Feedback Form (Continued) 



Directions: Items 22-28 are typical instructional activities for a class. How much improvement (if any) do you think 
your teacher needs to improve each of these activities in order to enhance your learning in class. 



For 22-28: A=MUCH improvement needed B=SOME improvement needed C=NO improvement needed 

22. Motivating students to learn by making class interesting, relevant, realistic and fun A B C D E 

23. Informing students of the objectives for the lesson A B C D E 

24. Helping students recall prior knowledge essential for learning the new material A B C D E 

25. Presenting information and relevant examples about the facts and skills being taught A B C D E 

26. Providing an opportunity for in-class student practice of the skills being taught A B C D E 

27. Providing adequate feedback on practice problems to correct common errors A B C D E 

28. Summarizing the key points of the lesson at the end of the class A B C D E 

29. Overall, how effective would you rate your teacher’s instruction? A B C D E 

A=Poor B=Below average C=Average D=Above average E=Excellent 

30. Overall, how important do you think the activities in questions 22-28 are for the teacher to use in order for you 
to learn as much as possible in this class? A B C D E 

A=Not at all important B=Not very important C=Important D=Very important E=Extremely important 



The following questions allow you to provide more detailed feedback to your instructor and the researcher. 

Please write your answers in the space provided or mark the appropriate letter to the right where appropriate. 

To ensure confidentiality, your instructor will be given typed comments and never see the handwritten comments. 



31. What is your gender? A=Female B=Male A B C D E 

32. What academic area do you believe you will major in? A B C D E 

A=Natural Sciences B=Computers/Engineering/Math C=Arts D=Humanities E=Education/Business 

33. Overall, how interesting is this course compared to other courses you have taken? A B C D E 

A=Not at all B=Less interesting C=Similar D=More interesting E=Most interesting 

34. How useful do you believe this course will be to your future academic studies and/or success in this field?... A B C D E 

A=Not useful B=Somewhat useful C=Useful D=Very useful E=Extremely useful 

35. What is your level of confidence that you can apply what you learned in this course in the future? A B C D E 

A=Not confident B=Somewhat confident C=Confident D=Very confident E=Extremely confident 

36. Overall, how satisfied are you with this course? A B C D E 

A=Extremely dissatisfied B=Dissatisfied C=Satisfied D=Very satisfied E=Extremely satisfied 

37. Describe what you really like about your teacher’s instruction. 



38. Describe what you do NOT like about your teacher’s instruction (if any). 



Thank you for your honest feedback! 

Table 1 

Ratings of Frequency of Use of Instructional Activities: Mean and Standard Deviation by 
Treatment Group* 

Activity                               Feedback (N = 18)   Control (N = 19)   Eta²
Motivating students                    3.50 (.39)          3.05 (.61)         .36
Telling students objectives            4.19 (.41)          3.88 (.56)         .48
Informing students of prerequisites    3.56 (.42)          3.13 (.55)         .45
Presenting information & examples      3.89 (.38)          3.51 (.63)         .45
Providing practice & feedback          3.70 (.49)          3.40 (.65)         .29
Summarizing the lesson                 3.74 (.46)          3.29 (.64)         .36
Total mean score                       3.74 (.40)          3.36 (.59)         .43

Note. Scale ranges from 1 (not at all) to 5 (almost always). Means adjusted based on initial 
assessment scores. Standard deviations in parentheses. Eta² = proportion of variance attributed 
to treatment. 

* p < .01 for all group comparisons 

Table 2 

Adjusted Student Performance on Final Exam Scores by Department* 

Department           Feedback Grp     Control Grp      F        p
Computer Literacy    81.07 (2.95)     79.68 (3.08)     1.380    .257
Chemistry            66.40 (2.96)     64.71 (5.23)     .997     .336

*Exam scores based on 100-point scale and adjusted with math SAT and standard deviation of 
math SAT scores as covariates. Standard deviations of scores in parentheses. 

Table 3 

A Comparison of Correlations Between Student Rating Dimensions and Student Learning by 
Section (N = 61) in Computer Literacy Course 

IAFF Activity/                            Hampton Study   Cohen Meta-analyses   Feldman Meta-analyses
(Comparable Rating Dimension)             (2001)          (1981, 1987)          (1989, 1997)

Motivating students                       .31*
  (Rapport)                                               .30
  (Interest/Motivation)                                   .26
  (Stimulate interest in course)                                                .38
  (Motivate student to do best)                                                 .38
  (Enthusiasm for subject or teaching)                                          .27
  (Pleasantness of classroom)                                                   .23
  (Value/usefulness/relevance of course)                                        .17
Objectives                                .22             N/A                   .35
Prerequisites                             .29*            N/A                   N/A
Present information                       .36**
  (Interaction)                                           .45
  (Clarity & understandableness)                                                .56
  (Discussion)                                                                  .36
  (Encourage questions/opinions)                                                .36
  (Elocution)                                                                   .35
Practice/Feedback                         .28*
  (Feedback)                                              .29                   .23
  (Availability & helpfulness)                                                  .36
Summarize lesson                          .27*            N/A                   N/A
Total IAFF Score (items 1-20)             .32*
  (Structure)                                             .55
  (Pursued/met course objectives)                                               .49
Teaching Effectiveness                    .40**           .40                   .39

*p < .05, ** p < .01 

[Figure not reproduced. Axis label: Teaching Effectiveness. Legend: Control, Feedback.] 

Figure 1. Adjusted Mean Teaching Effectiveness Scores for Feedback and Control 
Groups across Departments 